\section{A Note on Krylov Solvers}

This is a short note on the intuition behind Krylov subspace solvers. The argument is not rigorous; the goal is only to convey the idea. Given a linear system

\begin{equation}\nonumber
\textbf{Ax} = \textbf{b}
\end{equation}

we can solve it naively by inverting the matrix \textbf{A} to get

\begin{equation}\nonumber
\textbf{x} = \textbf{A}^{-1}\textbf{b}
\end{equation}

This operation is expensive ($O(n^3)$ for a dense $n \times n$ matrix), so in practice it is rarely done, especially for large systems. Many classes of methods exist; here we focus on iterative methods built on Krylov subspaces. By the Cayley--Hamilton theorem, the inverse of an invertible matrix can be written as a linear combination of its powers:

\begin{equation}\nonumber
\textbf{A}^{-1} = \sum_{k=0}^{n-1} c_k \textbf{A}^k
\end{equation}

where $n$ is the dimension of \textbf{A}; the $k=0$ term contributes the identity matrix. Then we have

\begin{equation}\nonumber
\textbf{x} = \textbf{A}^{-1}\textbf{b} = \sum_{k=0}^{n-1} c_k \textbf{A}^k\textbf{b}
\end{equation}
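This identity can be checked numerically. The sketch below (NumPy; the small random test matrix is an illustrative choice) obtains the coefficients $c_k$ from the characteristic polynomial and reconstructs \textbf{x} from powers of \textbf{A} applied to \textbf{b}:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n)) + n * np.eye(n)  # small, well-conditioned test matrix
b = rng.standard_normal(n)

# Characteristic polynomial coefficients, highest degree first:
# p(t) = t^n + a[1] t^(n-1) + ... + a[n], with a[0] = 1.
a = np.poly(A)

# Cayley-Hamilton gives p(A) = 0; multiplying by A^{-1} and rearranging:
# A^{-1} = -(1/a[n]) (A^{n-1} + a[1] A^{n-2} + ... + a[n-1] I),
# i.e. the coefficient of A^k is c_k = -a[n-1-k] / a[n].
c = -a[n - 1::-1] / a[n]

x = np.zeros(n)
Akb = b.copy()                 # holds A^k b as k runs from 0 to n-1
for k in range(n):
    x += c[k] * Akb
    Akb = A @ Akb

print(np.linalg.norm(x - np.linalg.solve(A, b)))  # close to zero
```

Note that the powers of \textbf{A} are never formed explicitly; only matrix--vector products $\textbf{A}^k\textbf{b}$ are needed, which is exactly what makes Krylov methods attractive for large sparse systems.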

In practice, working in finite precision and with a truncated expansion of the inverse ($k < n$ terms), we compute an approximation

\begin{equation}\nonumber
\textbf{x}_k = \sum_{j=0}^{k-1} c_j \textbf{A}^j\textbf{b}
\end{equation}
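As a quick sanity check that such a truncated expansion can work, the sketch below (the diagonal test matrix, sizes, and use of a raw power basis are illustrative choices) searches the truncated subspace directly, picking the coefficients by a least-squares solve:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 50, 10                           # system size and truncation order
A = np.diag(np.linspace(1.0, 10.0, n))  # simple SPD test matrix
b = np.ones(n)

# Power basis K = [b, Ab, ..., A^{k-1} b], columns normalized for stability
K = np.empty((n, k))
K[:, 0] = b / np.linalg.norm(b)
for j in range(1, k):
    w = A @ K[:, j - 1]
    K[:, j] = w / np.linalg.norm(w)

# x_k = K c with c chosen to minimize the residual ||b - A K c||_2
c, *_ = np.linalg.lstsq(A @ K, b, rcond=None)
x_k = K @ c

print(np.linalg.norm(b - A @ x_k) / np.linalg.norm(b))  # small relative residual
```

Even with only $10$ of the $50$ possible terms, the residual is already small for this well-conditioned test problem.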

which means we search for an approximation in a subspace, namely the Krylov subspace spanned by $\{\textbf{b}, \textbf{A}\textbf{b}, \cdots , \textbf{A}^{k-1}\textbf{b}\}$. One popular criterion for \textit{appropriate} is to minimize the residual norm $\|\textbf{b}-\textbf{A}\textbf{x}_k\|_2$. So far so good. However, in most cases the basis vectors $\textbf{A}^j\textbf{b}$ quickly become nearly linearly dependent: repeatedly multiplying by \textbf{A} is exactly the power iteration, so the vectors cluster around the eigenvector of the largest-magnitude eigenvalue. An orthogonalization process is therefore needed, such as the \textit{Lanczos} iteration for symmetric problems and the \textit{Arnoldi} iteration for the general case.